Euler Circuits, DNA Sequencing by Hybridization, and a New Graph Polynomial That Counts Euler Circuit Decompositions

Author

  • Richard A. Arratia
Abstract

In this talk I will survey two papers written in collaboration with Béla Bollobás, Don Coppersmith, and Greg Sorkin. The second of these papers studies how Euler circuits for a type of Eulerian directed graph can be counted by a recursion relation. Generalizing this recursion relation defines a polynomial, the interlace polynomial, on any undirected graph. We introduce this new polynomial and explore some of its properties. In the first paper surveyed in my talk, I present some new results related to sequencing by hybridization, a method of reconstructing a long DNA string (that is, figuring out its nucleotide sequence) from knowledge of its short substrings. Unique reconstruction is not always possible, and the goal of this paper is to study the number of reconstructions of a random DNA string, under an appropriate probabilistic model. For a given string, the number of reconstructions is determined by the pattern of repeated substrings; in an appropriate limit substrings will occur at most twice, so the pattern of repeats is given by a pairing: a string of length 2n in which each symbol occurs twice. A pairing induces a 2-in, 2-out graph, whose directed edges are defined by successive symbols of the pairing (for example, the pairing ABBCAC induces the graph with edges AB, BB, BC, and so forth), and the number of reconstructions is simply the number of Euler circuits in this 2-in, 2-out graph. The original problem is thus transformed into a question about pairings: how many n-symbol pairings have k Euler circuits? We show how to compute this function, in closed form, for any fixed k, and we present the functions explicitly for k = 1 to 9. The key is a decomposition theorem: the Euler "circuit number" of a pairing is the product of the circuit numbers of "component" sub-pairings. These components come from connected components of the "interlace graph", which has the pairing's symbols as vertices, and edges when symbols are "interlaced".
(A and B are interlaced if the pairing has the form [...].) We carry these results back to the original question about DNA strings, with a total variation distance upper bound [...]
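
As a rough illustration of the construction described in the abstract, the following Python sketch builds the 2-in, 2-out graph of a pairing, counts its Euler circuits by brute force, and compares the result with the product over interlace-graph components for the abstract's example ABBCAC. Several points are assumptions rather than statements from the abstract: the pairing is read cyclically (the last symbol is followed by the first, which is what gives every vertex two incoming and two outgoing edges), parallel edges are distinguished by their position in the pairing, circuits are counted up to rotation, and "interlaced" is taken in the usual sense that exactly one occurrence of B lies between the two occurrences of A. All function names are hypothetical.

```python
from collections import defaultdict


def pairing_edges(pairing):
    """Directed edges between cyclically successive symbols (assumed cyclic)."""
    m = len(pairing)
    return [(pairing[i], pairing[(i + 1) % m]) for i in range(m)]


def count_euler_circuits(pairing):
    """Brute-force count of Euler circuits, up to rotation.

    Every circuit uses edge 0 exactly once, so fixing edge 0 as the first
    edge counts each cyclic circuit exactly once.
    """
    edges = pairing_edges(pairing)
    out = defaultdict(list)
    for idx, (tail, _) in enumerate(edges):
        out[tail].append(idx)
    used = [False] * len(edges)

    def extend(vertex, remaining):
        if remaining == 0:
            return 1 if vertex == edges[0][0] else 0
        total = 0
        for idx in out[vertex]:
            if not used[idx]:
                used[idx] = True
                total += extend(edges[idx][1], remaining - 1)
                used[idx] = False
        return total

    used[0] = True
    return extend(edges[0][1], len(edges) - 1)


def interlace_components(pairing):
    """Connected components of the interlace graph, as sorted symbol lists."""
    pos = defaultdict(list)
    for i, s in enumerate(pairing):
        pos[s].append(i)

    def interlaced(a, b):
        # Assumed definition: exactly one b lies between the two a's.
        a1, a2 = pos[a]
        return sum(a1 < p < a2 for p in pos[b]) == 1

    symbols = sorted(pos)
    seen, components = set(), []
    for s in symbols:
        if s in seen:
            continue
        seen.add(s)
        stack, comp = [s], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in symbols:
                if v not in seen and interlaced(u, v):
                    seen.add(v)
                    stack.append(v)
        components.append(sorted(comp))
    return components


def circuit_number_by_components(pairing):
    """Product of circuit numbers of the component sub-pairings."""
    product = 1
    for comp in interlace_components(pairing):
        sub = [s for s in pairing if s in comp]
        product *= count_euler_circuits(sub)
    return product


print(count_euler_circuits("ABBCAC"))          # 2
print(interlace_components("ABBCAC"))          # [['A', 'C'], ['B']]
print(circuit_number_by_components("ABBCAC"))  # 2
```

For ABBCAC the interlace graph has the single edge A-C and the isolated vertex B, so the sub-pairings are ACAC (two circuits) and BB (one circuit), and their product agrees with the direct count. The brute-force search is exponential in n and is meant only for checking small examples.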


Similar resources

The Interlace Polynomial of Graphs at -1

The study of Euler circuits of directed graphs related to DNA sequencing [1] inspired Arratia, Bollobás and Sorkin [2] to introduce a new graph polynomial satisfying a striking recurrence relation. Although in [3] a fair amount is proved about the interlace polynomial, it is still a rather mysterious graph invariant. The aim of this note is to shed more light on the interlace polynomial by prov...


Optimal Euler Circuit of Maximum Contiguous Cost

This paper introduces a new graph problem to find an Optimal Euler Circuit (OEC) in an Euler graph. OEC is defined as the Euler circuit that maximizes the sum of contiguous costs along it, where the contiguous cost is assigned for each of the two contiguous edges incident to a vertex. We prove that the OEC problem is NP-complete. A polynomial time algorithm will be presented for the case of a g...


The Interlace Polynomial: a New Graph Polynomial (Richard Arratia)

LIMITED DISTRIBUTION NOTICE: This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and sp...


Graph Traversals, Genes, and Matroids: An Efficient Case of the Travelling Salesman Problem

In this paper we consider graph traversal problems (Euler and Travelling Salesman traversals) that arise from a particular technology for DNA sequencing: sequencing by hybridization (SBH). We first explain the connection of the graph problems to SBH and then focus on the traversal problems. We describe a practical polynomial time solution to the Travelling Salesman Problem in a rich class of dir...


Binary nullity, Euler circuits and interlace polynomials

A theorem of Cohn and Lempel [J. Combin. Theory Ser. A 13 (1972), 83-89] gives an equality involving the number of directed circuits in a circuit partition of a 2-in, 2-out digraph and the GF(2)-nullity of an associated matrix. This equality is essentially equivalent to the relationship between directed circuit partitions of 2-in, 2-out digraphs and vertex-nullity interlace polynomials of circl...



Publication date: 2007